Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HolSmt: implement div and mod, fix proof replay, fix translation #1210

Merged
merged 27 commits into from
Apr 9, 2024

Conversation

someplaceguy
Copy link
Contributor

@someplaceguy someplaceguy commented Mar 12, 2024

Hi!

This PR implements support for DIV, MOD, $/, $%, quot and rem, i.e. all div and mod operators for num and int. It also enables proof replay for all the self-tests that contained these operators. No changes were made outside HolSmt.

In total, 78 more self-tests are now passing (with Z3 v4 proof replay). In fact, I believe all arithmetic test cases are now passing, including with proof replay, with the exception of the word tests (and one which I added and mention below).

The support for the div and mod operators were implemented using the following approach:

  1. First, I defined two functions in HolSmtTheory called smtdiv and smtmod, which implement the semantics of SMT-LIB's integer division and modulo operators.
    • These two operators have different definitions than all of HOL4's div and mod operators; however smtdiv was defined based on $/ and smtmod based on $% for simplicity.
  2. A test case was added to selftests.sml which verifies that smtdiv and smtmod exactly match the SMT-LIB definition.
    • Initially, I was planning on manually proving this theorem, but it appears to be quite laborious to do so.
    • However, as always, Z3 can prove this easily (hence why this was proved in selftests.sml).
    • Unfortunately, we can't reconstruct this proof (because one of the proof steps seems to rely on nonlinear arithmetic), so the proof is not really meaningful...
  3. Theorems were added to HolSmtTheory which define HOL4's div and mod operators in terms of smtdiv and smtmod (these are easier to prove). These theorems are now added to the proving context by default, so that SMT solvers can translate HOL4's div and mod operators into the equivalent SMT-LIB div and mod operators which they natively support.
  4. SMT-LIB translation and parser support were added for smtdiv and smtmod.
  5. Support was added to the linear arithmetic proof handlers so that they can prove goals that have smtdiv and smtmod (by rewriting with their definitions into equivalent HOL4 $/ and $% operations).

This PR also implements the following:

  1. Fixed the translation of let ... and ... in ..., i.e. the one with parallel definitions, into the SMT-LIB language. This translation had never actually worked before, only let ... ; ... in ... was working properly.
  2. Fixed slowness (more like an apparent hang) due to the workaround for "intLib.ARITH_PROVE 0n = x * Num 0i" fails unexpectedly #1203. Apparently, COOPER_PROVE is much slower than ARITH_PROVE in some cases, so now we try the latter before falling back to the former.
  3. Major improvements in the arithmetic proving procedures:
  4. Implemented a workaround for a Z3 bug where it adds an hypothesis but then forgets to discharge it.
    • Fortunately, such hypotheses always seem to be of the form P = P, so it's easy to discharge them ourselves.
    • However, I am planning to file a Z3 bug (or even a PR) once I get an opportunity to debug this issue.
  5. Implemented a handler for the iff-false Z3 proof inference rule.

I'm happy with the results, although I was a bit surprised by how hard it was to fix proof replay for these tests. I was expecting Z3 to throw more curve-balls, but this time Z3 did not give me much trouble.

Instead, the major problems were the limitations in the HOL4 integer arithmetic decision procedures, especially #1207. I implemented some workarounds (i.e. I try to prove the goals using some additional tactics, including arithmetic simplifications) but I don't expect these to be reliable and in fact, if I change some theorems around a bit I do encounter some linear arithmetic steps which HOL4 cannot prove automatically even with my workarounds.

Anyway, I think this concludes my work for the near future, as the next issues to tackle are quite hard to solve.

Thanks!

cc @tjark

The SMT-LIB term translator couldn't handle parallel definitions
properly. This commit fixes that.
The previous workaround for ARITH_PROVE failing to prove some terms
changed the `th_lemma_arith` proof handler to use COOPER_PROVE
instead.

However, COOPER_PROVE sometimes is too slow and appears to hang when
replaying some Z3 proofs.

This commit changes the `th_lemma_arith` handler to first try
ARITH_PROVE which is much faster, and only if it fails, then try
COOPER_PROVE.
SMT_NUM_ADD and SMT_NUM_SUB were meant to be used in place of INT_ADD
and INT_NUM_SUB for optimization reasons.

However, this causes a regression in proof replay in one of the
self-tests. Since this regression is not straightforward to fix,
these theorems are not being used yet.

NUM_OF_INT was also added to the default set of theorems since it
helps solve some `num`-related goals.
Unfortunately, these operators don't seem to be documented.

However, from a quick reading of the Z3 source code, it seems that
they seem to be similar to `div` and `mod`, but they seem to be used
to indicate that Z3 can't prove that the divisor is non-zero.
The `th_lemma_arith` was running into several issues. The following
changes were made:

1. The `generalize_ite` procedure was removed, since the arithmetic
decision procedures can handle `if ... then ... else ...` now. It was
causing replay failures because some terms were not provable unless
the decision procedures could inspect the `if` bodies.

2. The arithmetic decision procedures don't understand `smtdiv` and
`smtmod`; so before proving the term, we now rewrite it with the
definitions of these functions.

3. A workaround was also implemented for some instances of the
following issue:

HOL-Theorem-Prover#1207

Namely, the integer decision procedures cannot decide some terms
containing `ABS` and `Num`, so we rewrite them when possible.
This commit contains a workaround for the following issue:

HOL-Theorem-Prover#1209
In some proof certificates, Z3 emits a `hypothesis` inference rule
without discharging it later with a `lemma` rule. This causes some
proofs to fail at the end of the proof replay procedure, because
they end up containing an extra hypothesis.

Fortunately, so far all such undischarged hypothesis rules were of the
form `p = p`, so for now we can easily work around this issue by
creating such theorems without adding an assumption.

This workaround allows us to enable proof replay for 5 more
self-tests.
Since these two decision procedures need to handle the same kind of
terms (including smtdiv / smtmod rewrites), this commit merges them.
This commit adds some tactics which help prove some arithmetic proof
steps that intLib.{ARITH,COOPER}_TAC can't deal with.

These are basically some workarounds for:

HOL-Theorem-Prover#1207

This allows us to enable proof replay for the remaining arithmetic
test cases that didn't already have proof replay enabled, with the
exception of the very last test case (because we can't replay
nonlinear arithmetic proof steps).

That said, I don't expect these workarounds to be very reliable.

Ideally, it would be better to improve HOL4's linear arithmetic
decision procedures.
@someplaceguy
Copy link
Contributor Author

Added one commit to update the HolSmt documentation regarding recent improvements.

@mn200
Copy link
Member

mn200 commented Mar 13, 2024

Can you put the Euclidean div and mod constants (perhaps call them ediv and emod) into integerScript.sml please?

@someplaceguy
Copy link
Contributor Author

someplaceguy commented Mar 13, 2024

Can you put the Euclidean div and mod constants (perhaps call them ediv and emod) into integerScript.sml please?

Sure, but how would you like me to do that? Just add those definitions to integerScript.sml as they are (except for the renaming)? Anything else?

For reference, I took a look at how int_quot and int_rem are defined and there seems to be a lot going on there, e.g.:

  1. int_quot is actually defined based on the natural numbers (instead of being defined based on int_div like I did). This would be easy enough to do.
  2. int_quot (and int_div) are constant specifications instead of constant definitions (which require proving a theorem, although that seems quite doable). Should I use constant specifications as well?
  3. Infix constants are also defined, although this seems trivial.
  4. Some quot-related theorems are proved. I guess I could try to prove similar ones, although it would take me some time... However, INT_QUOT_UNIQUE seems incredibly daunting to prove...
  5. rem also has quite a few theorems and a few daunting ones as well...
  6. Furthermore, these two operators also have theorems used for rewrites. I think I understand the *_REDUCE theorems but not the INT_{QUOT,REM}_CALCULATE ones.
  7. These operators also have syntax manipulation functions defined in intSyntax.{sig,sml} (this seems trivial to add).
  8. For completion I assume simplification rules are added somewhere, although I can't figure out where?
  9. And then presumably these operators have support in the omega and cooper decision procedures, although I can't even find the string "quot" in there? (And even if I did, I doubt I'd be able to figure out how to add support for ediv and emod there...).

So, which of the above should I do?

@binghe
Copy link
Member

binghe commented Mar 13, 2024

Can you put the Euclidean div and mod constants (perhaps call them ediv and emod) into integerScript.sml please?

Sure, but how would you like me to do that? Just add those definitions to integerScript.sml as they are (except for the renaming)? Anything else?

For reference, I took a look at how int_quot and int_rem are defined and there seems to be a lot going on there, e.g.:

  1. int_quot is actually defined based on the natural numbers (instead of being defined based on int_div like I did). This would be easy enough to do.
  2. int_quot (and int_div) are constant specifications instead of constant definitions (which require proving a theorem, although that seems quite doable). Should I use constant specifications as well?
  3. Infix constants are also defined, although this seems trivial.
  4. Some quot-related theorems are proved. I guess I could try to prove similar ones, although it would take me some time... However, INT_QUOT_UNIQUE seems incredibly daunting to prove...
  5. rem also has quite a few theorems and a few daunting ones as well...
  6. Furthermore, these two operators also have theorems used for rewrites. I think I understand the *_REDUCE theorems but not the INT_{QUOT,REM}_CALCULATE ones.
  7. These operators also have syntax manipulation functions defined in intSyntax.{sig,sml} (this seems trivial to add).
  8. For completion I assume simplification rules are added somewhere, although I can't figure out where?
  9. And then presumably these operators have support in the omega and cooper decision procedures, although I can't even find the string "quot" in there? (And even if I did, I doubt I'd be able to figure out how to add support for ediv and emod there...).

So, which of the above should I do?

I think @mn200 simply meant that you can move the definitions of smtdiv and smtmod (renamed to ediv and emod) and related theorems (from SMT_NUM_ADD to SMT_INT_REM), i.e. all you added into HolSmtScript.sml, to integerScript.sml. In this way, the definitions and theorems can be reused by potential other users in the future.

@mn200
Copy link
Member

mn200 commented Mar 14, 2024

@binghe has it right; what you have already is a good starting point for future work, and it would be better to have that work happen in integerTheory.

The previously-implemented workaround might have an issue where if Z3
introduces an hypothesis of the form `P = P` but then *doesn't* forget
to discharge it, we might fail to replay the `lemma` proof rule
because we avoided adding an assumption.

Instead, this commit fixes the workaround so that we remove the extra
hypothesis at the end of the proof replay procedure, i.e. only when we
are sure that Z3 *did* forget to discharge the hypothesis.
Now that integerTheory contains the definitions for Euclidean div and
mod, let's use that instead of redefining these operators in
HolSmtTheory.
Now that HOL-Theorem-Prover#1209 is fixed, we can remove the workaround for it in
HolSmt.
HOL4's integer arithmetic decision procedures cannot handle `quot` and
`rem`, so proof steps containing these symbols were failing to replay.

This commit rewrites goals containing these symbols to increase the
chances of a successful replay.

`thm_AUTO` was similarly improved, as it had the same limitation and
was failing to prove some of the self-tests.
@someplaceguy
Copy link
Contributor Author

someplaceguy commented Mar 14, 2024

Added 7 more commits:

  1. Fix a possible bug in a previously implemented workaround for a Z3 bug where a hypothesis is introduced but not discharged. Instead of avoiding to introduce the assumption, now HolSmt will get rid of it at the end of the proof replay procedure. This should make proof replay more robust, as it is possible that Z3 might introduce a hypothesis of the form P = P and then not forget to discharge it, in which case lemma might require the hypothesis to be there.
  2. Moved the Euclidean div and mod definitions to integerTheory, as well as the theorems that had been added to HolSmt, as requested by @binghe and @mn200.
  3. Changed HolSmt to use integerTheory's definition of ediv and emod, as well as the theorems that were moved there, thus removing the definitions and theorems from HolSmt.
  4. Removed workaround for intLib.ARITH_PROVE raises exception NotFound #1209 as it is now fixed (thank you @mn200!)
  5. Fixed replay of goals containing quot and rem due to intLib.{ARITH,COOPER}_PROVE can't prove certain goals #1207.
  6. Add 73 new tests exercising quot and rem, all with Z3 proof replay enabled. Perplexingly, I had forgotten to add these.
  7. Add some simps to thm_AUTO, which help make advances in proving the last self-test but do not yet allow it to succeed.

Note: I wasn't sure what to name the newly added definitions and theorems in integerTheory, nor where to put them exactly, so let me know if you'd like me to make any further changes.

Thanks!

In case `rewrite` proof rules ever come to have indices, let's ignore
them instead of erroring out.
@someplaceguy
Copy link
Contributor Author

Added one more commit which ignores indices in rewrite proof rules instead of failing to parse the proof certificates with an error. This way we don't fail to parse proof certificates in case e.g. a future version of Z3 adds an index to these proof rules.

@someplaceguy
Copy link
Contributor Author

Added one more commit which partially reverts a previous one, which had added a change that was actually unnecessary.

I realized this was unnecessary after thinking about it (and realizing it didn't make much sense), however I don't understand why I introduced that change in the first place, since I cannot reproduce those test failures and nothing relevant was changed in the mean time...

It should not be necessary to rewrite these symbols with their
definitions, because from the perspective of SMT solvers, these should
just be functions like any others. Therefore, when replaying
arithmetic proof steps, the proof handler should not need to
understand what `quot` and `rem` are.

This is a partial revert of the following commit:
6220a17

It's not clear to me why I introduced this change in the first place,
since the tests do work fine without it. Also, there doesn't seem to
be any performance regression.
Add support for parsing `declare-const` commands to the SMT-LIB
parser.
The SMT-LIB file parser was ending up raising an end-of-stream
exception unless the `(exit)` command was present in the SMT-LIB file,
even though the `(exit)` command does not seem to be mandatory
according to the SMT-LIB v2.6 standard.
@someplaceguy
Copy link
Contributor Author

Added 2 more commits, which fix issues when parsing SMT-LIB commands.

This part of the parser is used only for importing SMT-LIB problems into HOL4, which is useful for testing HolSmt but is not the main use case of HolSmt.

This commit fixes a parsing error when parsing arithmetic `th-lemma`
inference rules with an `assign-bounds` parameter, i.e.:

((_ th-lemma arith assign-bounds ...) ...)

The error was due to `assign-bounds` not being in the known term
dictionary.
@someplaceguy
Copy link
Contributor Author

Added one more commit with a simple fix, this time fixing an error when parsing some Z3 proof certificates.

Now that all known issues were fixed wrt. ARITH_PROVE not proving
goals that COOPER_PROVE could, we can remove this workaround, thus
making proof replay faster in some cases.
@someplaceguy
Copy link
Contributor Author

someplaceguy commented Mar 25, 2024

Added one commit to remove the workaround for #1203, since it's no longer necessary. Thanks @mn200!

@mn200
Copy link
Member

mn200 commented Apr 9, 2024

Thanks for all of this work!

@mn200 mn200 merged commit 84f2039 into HOL-Theorem-Prover:develop Apr 9, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants